Overview

Dataset statistics

Number of variables: 17
Number of observations: 6730436
Missing cells: 1275801
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.4 GiB
Average record size in memory: 705.0 B
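
The missing-cell percentage can be cross-checked from the counts above, since the number of cells is the number of variables times the number of observations. The following is a minimal sketch of that arithmetic; the helper name `percent` is illustrative and not part of the report.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Figures copied from the dataset statistics above.
n_variables = 17
n_observations = 6_730_436
missing_cells = 1_275_801

total_cells = n_variables * n_observations
missing_pct = percent(missing_cells, total_cells)

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
```

The same helper applies to any per-variable count in the report when the denominator is the number of observations.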

Variable types

DateTime: 2
Categorical: 1
Text: 7
Numeric: 7

Alerts

High correlation: DestinationLat is highly overall correlated with OriginLat and 1 other field
High correlation: DestinationLong is highly overall correlated with OriginLong and 1 other field
High correlation: OriginLat is highly overall correlated with DestinationLat and 1 other field
High correlation: OriginLong is highly overall correlated with DestinationLong and 1 other field
High correlation: VehicleLocation.Latitude is highly overall correlated with DestinationLat and 1 other field
High correlation: VehicleLocation.Longitude is highly overall correlated with DestinationLong and 1 other field
Missing: ExpectedArrivalTime has 872302 (13.0%) missing values
Missing: ScheduledArrivalTime has 172333 (2.6%) missing values
Zeros: DistanceFromStop has 376918 (5.6%) zeros

Reproduction

Analysis started: 2024-10-18 08:33:20.331476
Analysis finished: 2024-10-18 08:36:54.932884
Duration: 3 minutes and 34.6 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
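
The reported duration follows directly from the two timestamps. As a small sketch using only the Python standard library (the format string is an assumption about how the timestamps are written above):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps above

started = datetime.strptime("2024-10-18 08:33:20.331476", FMT)
finished = datetime.strptime("2024-10-18 08:36:54.932884", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.1f} seconds")
```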

Variables

[variable name missing]
Date

Distinct: 218287
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 MiB
Minimum: 2017-06-01 00:01:18
Maximum: 2017-06-30 23:53:38
Histogram with fixed size bins (bins=50)

DirectionRef
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 372.3 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6730436
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count    Frequency (%)
1      3384112  50.3%
0      3346324  49.7%

Length

Histogram of lengths of the category

Most occurring characters

Value  Count    Frequency (%)
1      3384112  50.3%
0      3346324  49.7%

Most occurring categories, scripts, and blocks

Value      Count    Frequency (%)
(unknown)  6730436  100.0%

[variable name missing]
Text

Distinct: 326
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 387.9 MiB

Length

Max length: 8
Median length: 3
Mean length: 3.4291633
Min length: 2

Characters and Unicode

Total characters: 23079764
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
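
The mean length reported for this variable is consistent with dividing the total character count by the number of non-missing rows. A minimal sketch of that relationship follows; the function name `mean_length` and the `n_missing` parameter are illustrative assumptions, not part of the report.

```python
def mean_length(total_characters: int, n_rows: int, n_missing: int = 0) -> float:
    """Average string length over the non-missing rows."""
    return total_characters / (n_rows - n_missing)

N_ROWS = 6_730_436  # number of observations from the overview

# This variable: 23079764 characters in total, no missing values.
mean_here = mean_length(23_079_764, N_ROWS)
print(f"{mean_here:.7f}")
```

For variables with missing values, passing the missing count as `n_missing` keeps the denominator restricted to rows that actually hold text.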

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B8
2nd row: S61
3rd row: Bx10
4th row: Q5
5th row: Bx1

Words

Value               Count    Frequency (%)
b6                  124896   1.9%
b41                 101068   1.5%
q58                 94390    1.4%
m15-sbs             86625    1.3%
q44-sbs             85033    1.3%
b35                 84248    1.3%
q27                 83823    1.2%
bx36                83653    1.2%
b82                 80618    1.2%
m101                80422    1.2%
Other values (316)  5825660  86.6%

Most occurring characters

Value              Count    Frequency (%)
B                  4009298  17.4%
1                  2338086  10.1%
4                  1781490  7.7%
S                  1617710  7.0%
2                  1437081  6.2%
x                  1378807  6.0%
M                  1363064  5.9%
6                  1313638  5.7%
3                  1306050  5.7%
5                  1257513  5.4%
Other values (10)  5277027  22.9%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  23079764  100.0%

[variable name missing]
Text

Distinct: 606
Distinct (%): < 0.1%
Missing: 63156
Missing (%): 0.9%
Memory size: 492.4 MiB

Length

Max length: 39
Median length: 32
Mean length: 20.139376
Min length: 9

Characters and Unicode

Total characters: 134274857
Distinct characters: 61
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4 AV/95 ST
2nd row: ST GEORGE FERRY/S61 & S91
3rd row: E 206 ST/BAINBRIDGE AV
4th row: TEARDROP/LAYOVER
5th row: RIVERDALE AV/W 231 ST

Words

Value               Count     Frequency (%)
st                  2503797   10.5%
av                  2454334   10.3%
e                   513193    2.2%
w                   476915    2.0%
av/e                349285    1.5%
bl                  305157    1.3%
(blank)             271726    1.1%
rd                  233183    1.0%
av/w                227171    1.0%
george              194066    0.8%
Other values (907)  16215704  68.3%
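
The word counts above appear consistent with lowercasing each value and splitting it on whitespace, since tokens such as av/e and av/w keep their slash. The sketch below illustrates that kind of counting on the five sample rows; it is an assumption about the tokenization, not a statement of how ydata-profiling implements it, and the function name `word_counts` is illustrative.

```python
from collections import Counter

def word_counts(values):
    """Count lowercase, whitespace-separated tokens, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts

# The five sample rows shown for this variable.
samples = [
    "4 AV/95 ST",
    "ST GEORGE FERRY/S61 & S91",
    "E 206 ST/BAINBRIDGE AV",
    "TEARDROP/LAYOVER",
    "RIVERDALE AV/W 231 ST",
]

counts = word_counts(samples)
for token, count in counts.most_common(3):
    print(token, count)
```

Because the percentages in a word table are relative to the total number of tokens rather than the number of rows, they do not need to match the row-level frequencies reported elsewhere.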

Most occurring characters

Value              Count     Frequency (%)
(blank)            17131294  12.8%
A                  12587959  9.4%
T                  9160542   6.8%
S                  8674611   6.5%
E                  8367741   6.2%
R                  7713606   5.7%
/                  6588481   4.9%
L                  5959760   4.4%
V                  5624788   4.2%
N                  5396222   4.0%
Other values (51)  47069853  35.1%

Most occurring categories, scripts, and blocks

Value      Count      Frequency (%)
(unknown)  134274857  100.0%
OriginLat
Real number (ℝ)

High correlation 

Distinct: 662
Distinct (%): < 0.1%
Missing: 63156
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72961
Minimum: 40.506882
Maximum: 40.912365
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 51.3 MiB

Quantile statistics

Minimum: 40.506882
5-th percentile: 40.581245
Q1: 40.660664
Median: 40.715233
Q3: 40.807869
95-th percentile: 40.868076
Maximum: 40.912365
Range: 0.405483
Interquartile range (IQR): 0.147205

Descriptive statistics

Standard deviation: 0.090273681
Coefficient of variation (CV): 0.0022164141
Kurtosis: -0.84679197
Mean: 40.72961
Median Absolute Deviation (MAD): 0.071579
Skewness: -0.013067871
Sum: 2.7155571 × 10^8
Variance: 0.0081493375
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
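
Several of the OriginLat statistics above are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short sketch that recomputes them from the reported figures (small differences in the last digits are expected because the inputs are rounded):

```python
# Figures copied from the OriginLat statistics above.
minimum, maximum = 40.506882, 40.912365
q1, q3 = 40.660664, 40.807869
mean, std = 40.72961, 0.090273681

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(f"range={value_range:.6f} iqr={iqr:.6f}")
print(f"cv={cv:.10f} variance={variance:.10f}")
```

The same relationships explain why the longitude variables report a negative coefficient of variation: their means are negative while the standard deviation is always non-negative.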
Common values

Value               Count    Frequency (%)
40.704906           93956    1.4%
40.80323            92422    1.4%
40.761806           68468    1.0%
40.59351            68366    1.0%
40.731342           67586    1.0%
40.849113           64310    1.0%
40.609566           63932    0.9%
40.701748           63415    0.9%
40.729568           58892    0.9%
40.760429           54070    0.8%
Other values (652)  5971863  88.7%
(Missing)           63156    0.9%

Minimum 10 values

Value      Count  Frequency (%)
40.506882  10666  0.2%
40.508942  5513   0.1%
40.510155  46     < 0.1%
40.523865  111    < 0.1%
40.526684  347    < 0.1%
40.526787  47     < 0.1%
40.526802  99     < 0.1%
40.526962  158    < 0.1%
40.526997  39     < 0.1%
40.527     2213   < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
40.912365  38692  0.6%
40.910107  27265  0.4%
40.903339  20772  0.3%
40.90266   20560  0.3%
40.900444  114    < 0.1%
40.893269  1211   < 0.1%
40.889969  11803  0.2%
40.888432  169    < 0.1%
40.888401  12191  0.2%
40.888393  37     < 0.1%

OriginLong
Real number (ℝ)

High correlation 

Distinct: 660
Distinct (%): < 0.1%
Missing: 63156
Missing (%): 0.9%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.93111
Minimum: -74.248062
Maximum: -73.701866
Zeros: 0
Zeros (%): 0.0%
Negative: 6667280
Negative (%): 99.1%
Memory size: 51.3 MiB

Quantile statistics

Minimum: -74.248062
5-th percentile: -74.112297
Q1: -73.987373
Median: -73.932449
Q3: -73.879936
95-th percentile: -73.781059
Maximum: -73.701866
Range: 0.546196
Interquartile range (IQR): 0.107437

Descriptive statistics

Standard deviation: 0.094275259
Coefficient of variation (CV): -0.0012751771
Kurtosis: 0.867808
Mean: -73.93111
Median Absolute Deviation (MAD): 0.054115
Skewness: -0.32947249
Sum: -4.9291941 × 10^8
Variance: 0.0088878245
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count    Frequency (%)
-73.793304          93956    1.4%
-73.932449          92422    1.4%
-73.878334          79718    1.2%
-73.829559          68468    1.0%
-73.993996          68366    1.0%
-73.990288          67586    1.0%
-73.937752          64310    1.0%
-73.921814          63932    0.9%
-73.802399          63415    0.9%
-73.990051          58892    0.9%
Other values (650)  5946215  88.3%
(Missing)           63156    0.9%

Minimum 10 values

Value       Count  Frequency (%)
-74.248062  46     < 0.1%
-74.246948  5513   0.1%
-74.232979  10666  0.2%
-74.226295  30541  0.5%
-74.220856  2388   < 0.1%
-74.216454  145    < 0.1%
-74.208282  14     < 0.1%
-74.208115  10     < 0.1%
-74.207069  60     < 0.1%
-74.199394  1580   < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
-73.701866  23068  0.3%
-73.706421  17653  0.3%
-73.708687  7153   0.1%
-73.712791  110    < 0.1%
-73.720215  7624   0.1%
-73.723259  16634  0.2%
-73.7248    675    < 0.1%
-73.72506   3860   0.1%
-73.725151  738    < 0.1%
-73.726044  10910  0.2%

[variable name missing]
Text

Distinct: 778
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 535.6 MiB

Length

Max length: 68
Median length: 55
Mean length: 26.4411
Min length: 4

Characters and Unicode

Total characters: 177960128
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: BROWNSVILLE ROCKAWAY AV
2nd row: S I MALL YUKON AV
3rd row: RIVERDALE 263 ST
4th row: ROSEDALE LIRR STA via MERRICK
5th row: MOTT HAVEN 136 ST via CONCOURSE

Words

Value               Count     Frequency (%)
via                 4084130   11.4%
av                  2689657   7.5%
st                  2645218   7.4%
bus                 475173    1.3%
sta                 473147    1.3%
select              467312    1.3%
ltd                 461782    1.3%
jamaica             401099    1.1%
bay                 394768    1.1%
bl                  335273    0.9%
Other values (761)  23448925  65.4%

Most occurring characters

Value              Count     Frequency (%)
(blank)            29363798  16.5%
A                  12162410  6.8%
S                  11606550  6.5%
T                  10981081  6.2%
E                  10201454  5.7%
R                  9245861   5.2%
N                  7784506   4.4%
L                  7693210   4.3%
I                  6705936   3.8%
O                  6462285   3.6%
Other values (50)  65753037  36.9%

Most occurring categories, scripts, and blocks

Value      Count      Frequency (%)
(unknown)  177960128  100.0%

DestinationLat
Real number (ℝ)

High correlation 

Distinct: 530
Distinct (%): < 0.1%
Missing: 10346
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.728634
Minimum: 40.508106
Maximum: 40.912376
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 51.3 MiB

Quantile statistics

Minimum: 40.508106
5-th percentile: 40.580952
Q1: 40.660854
Median: 40.713356
Q3: 40.807545
95-th percentile: 40.868797
Maximum: 40.912376
Range: 0.40427
Interquartile range (IQR): 0.146691

Descriptive statistics

Standard deviation: 0.090072137
Coefficient of variation (CV): 0.0022115187
Kurtosis: -0.82145738
Mean: 40.728634
Median Absolute Deviation (MAD): 0.069584
Skewness: 0.0081833507
Sum: 2.7370009 × 10^8
Variance: 0.0081129899
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count    Frequency (%)
40.704933           94435    1.4%
40.809654           79753    1.2%
40.761745           70740    1.1%
40.643585           66830    1.0%
40.592949           63341    0.9%
40.609142           61441    0.9%
40.849033           59350    0.9%
40.699776           59280    0.9%
40.701683           57233    0.9%
40.80315            56679    0.8%
Other values (520)  6051008  89.9%

Minimum 10 values

Value      Count  Frequency (%)
40.508106  9928   0.1%
40.508942  5630   0.1%
40.50901   308    < 0.1%
40.510319  10     < 0.1%
40.526684  190    < 0.1%
40.526745  25     < 0.1%
40.526806  2614   < 0.1%
40.526836  568    < 0.1%
40.527     99     < 0.1%
40.53006   32539  0.5%

Maximum 10 values

Value      Count  Frequency (%)
40.912376  35818  0.5%
40.91008   31233  0.5%
40.903309  24316  0.4%
40.902779  20788  0.3%
40.900753  56     < 0.1%
40.893486  1259   < 0.1%
40.889843  11224  0.2%
40.888496  13720  0.2%
40.885086  17961  0.3%
40.881062  20774  0.3%

DestinationLong
Real number (ℝ)

High correlation 

Distinct: 531
Distinct (%): < 0.1%
Missing: 10346
Missing (%): 0.2%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.931565
Minimum: -74.248192
Maximum: -73.701385
Zeros: 0
Zeros (%): 0.0%
Negative: 6720090
Negative (%): 99.8%
Memory size: 51.3 MiB

Quantile statistics

Minimum: -74.248192
5-th percentile: -74.113136
Q1: -73.989311
Median: -73.932266
Q3: -73.878326
95-th percentile: -73.779564
Maximum: -73.701385
Range: 0.546807
Interquartile range (IQR): 0.110985

Descriptive statistics

Standard deviation: 0.095103662
Coefficient of variation (CV): -0.0012863743
Kurtosis: 0.81906665
Mean: -73.931565
Median Absolute Deviation (MAD): 0.055145
Skewness: -0.31051508
Sum: -4.9682677 × 10^8
Variance: 0.0090447066
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count    Frequency (%)
-73.79332           94435    1.4%
-73.92836           79753    1.2%
-73.829529          70740    1.1%
-74.072609          66830    1.0%
-73.993385          63341    0.9%
-73.92144           61441    0.9%
-73.937309          59350    0.9%
-73.941505          59280    0.9%
-73.802475          57233    0.9%
-73.932266          56679    0.8%
Other values (521)  6051008  89.9%

Minimum 10 values

Value       Count  Frequency (%)
-74.248192  10     < 0.1%
-74.246948  5630   0.1%
-74.24675   308    < 0.1%
-74.230072  9928   0.1%
-74.226654  32539  0.5%
-74.220345  2504   < 0.1%
-74.208282  46     < 0.1%
-74.208115  74     < 0.1%
-74.206604  5      < 0.1%
-74.199394  8246   0.1%

Maximum 10 values

Value       Count  Frequency (%)
-73.701385  2441   < 0.1%
-73.701454  21360  0.3%
-73.706589  19743  0.3%
-73.70871   7841   0.1%
-73.712563  184    < 0.1%
-73.720772  8479   0.1%
-73.723145  18350  0.3%
-73.725174  4330   0.1%
-73.725914  11636  0.2%
-73.727409  17074  0.3%

[variable name missing]
Text

Distinct: 5719
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 423.1 MiB

Length

Max length: 10
Median length: 9
Mean length: 8.9097703
Min length: 8

Characters and Unicode

Total characters: 59966639
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): < 0.1%

Sample

1st row: NYCT_430
2nd row: NYCT_8263
3rd row: NYCT_4223
4th row: NYCT_8422
5th row: NYCT_4710

Words

Value                Count    Frequency (%)
nyct_5860            2978     < 0.1%
nyct_1238            2866     < 0.1%
nyct_6081            2857     < 0.1%
nyct_7107            2831     < 0.1%
nyct_1261            2827     < 0.1%
nyct_6061            2822     < 0.1%
nyct_6043            2808     < 0.1%
nyct_5997            2807     < 0.1%
nyct_5857            2806     < 0.1%
nyct_4573            2803     < 0.1%
Other values (5709)  6702031  99.6%

Most occurring characters

Value             Count     Frequency (%)
T                 6730436   11.2%
C                 6730436   11.2%
_                 6730436   11.2%
N                 6718018   11.2%
Y                 6718018   11.2%
4                 3437605   5.7%
7                 3226239   5.4%
5                 3123108   5.2%
6                 2995867   5.0%
8                 2778027   4.6%
Other values (8)  10778449  18.0%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  59966639  100.0%

VehicleLocation.Latitude
Real number (ℝ)

High correlation 

Distinct: 360308
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.728485
Minimum: 40.502879
Maximum: 40.932657
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 51.3 MiB

Quantile statistics

Minimum: 40.502879
5-th percentile: 40.592907
Q1: 40.659481
Median: 40.723225
Q3: 40.803181
95-th percentile: 40.866339
Maximum: 40.932657
Range: 0.429778
Interquartile range (IQR): 0.1437

Descriptive statistics

Standard deviation: 0.086818631
Coefficient of variation (CV): 0.002131644
Kurtosis: -0.92838433
Mean: 40.728485
Median Absolute Deviation (MAD): 0.070183
Skewness: 0.026693405
Sum: 2.7412046 × 10^8
Variance: 0.0075374746
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
40.714687              3540     0.1%
40.643562              2829     < 0.1%
40.807584              2793     < 0.1%
40.73029               2535     < 0.1%
40.707646              2483     < 0.1%
40.820521              2482     < 0.1%
40.576997              2320     < 0.1%
40.593481              2317     < 0.1%
40.795177              2301     < 0.1%
40.730118              2274     < 0.1%
Other values (360298)  6704562  99.6%

Minimum 10 values

Value      Count  Frequency (%)
40.502879  6      < 0.1%
40.502881  11     < 0.1%
40.502882  1      < 0.1%
40.502883  18     < 0.1%
40.502886  1      < 0.1%
40.50289   7      < 0.1%
40.502898  16     < 0.1%
40.502901  2      < 0.1%
40.502905  4      < 0.1%
40.50291   7      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
40.932657  1      < 0.1%
40.932079  1      < 0.1%
40.932013  1      < 0.1%
40.932002  1      < 0.1%
40.927591  1      < 0.1%
40.918263  1      < 0.1%
40.917285  1      < 0.1%
40.916806  1      < 0.1%
40.916289  1      < 0.1%
40.912675  1      < 0.1%

VehicleLocation.Longitude
Real number (ℝ)

High correlation 

Distinct: 426793
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.930712
Minimum: -74.252339
Maximum: -73.701414
Zeros: 0
Zeros (%): 0.0%
Negative: 6730436
Negative (%): 100.0%
Memory size: 51.3 MiB

Quantile statistics

Minimum: -74.252339
5-th percentile: -74.100952
Q1: -73.979298
Median: -73.936528
Q3: -73.882413
95-th percentile: -73.776967
Maximum: -73.701414
Range: 0.550925
Interquartile range (IQR): 0.096885

Descriptive statistics

Standard deviation: 0.089140224
Coefficient of variation (CV): -0.0012057266
Kurtosis: 0.77288539
Mean: -73.930712
Median Absolute Deviation (MAD): 0.045881
Skewness: -0.24948622
Sum: -4.9758593 × 10^8
Variance: 0.0079459796
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                  Count    Frequency (%)
-73.831286             3543     0.1%
-73.990592             3460     0.1%
-74.072656             2821     < 0.1%
-73.990411             2781     < 0.1%
-73.795448             2483     < 0.1%
-73.851537             2469     < 0.1%
-73.994042             2343     < 0.1%
-73.973036             2286     < 0.1%
-73.991257             2238     < 0.1%
-73.939805             2236     < 0.1%
Other values (426783)  6703776  99.6%

Minimum 10 values

Value       Count  Frequency (%)
-74.252339  1      < 0.1%
-74.252338  2      < 0.1%
-74.252337  3      < 0.1%
-74.252336  1      < 0.1%
-74.252335  4      < 0.1%
-74.252333  4      < 0.1%
-74.252328  3      < 0.1%
-74.252326  4      < 0.1%
-74.252325  3      < 0.1%
-74.252324  1      < 0.1%

Maximum 10 values

Value       Count  Frequency (%)
-73.701414  6      < 0.1%
-73.701416  1      < 0.1%
-73.701417  2      < 0.1%
-73.701492  208    < 0.1%
-73.701494  6      < 0.1%
-73.701496  3      < 0.1%
-73.701499  1      < 0.1%
-73.701548  140    < 0.1%
-73.701551  29     < 0.1%
-73.701553  290    < 0.1%

[variable name missing]
Text

Distinct: 10894
Distinct (%): 0.2%
Missing: 7002
Missing (%): 0.1%
Memory size: 491.1 MiB

Length

Max length: 46
Median length: 39
Mean length: 19.558554
Min length: 5

Characters and Unicode

Total characters: 131500644
Distinct characters: 72
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 361
Unique (%): < 0.1%

Sample

1st row: FOSTER AV/E 18 ST
2nd row: MERRYMOUNT ST/TRAVIS AV
3rd row: HENRY HUDSON PKY E/W 235 ST
4th row: HOOK CREEK BL/SUNRISE HY
5th row: GRAND CONCOURSE/E 196 ST

Words

Value                Count     Frequency (%)
av                   2701038   11.2%
st                   2680521   11.1%
e                    580864    2.4%
av/e                 564519    2.3%
w                    426581    1.8%
bl                   341588    1.4%
av/w                 312331    1.3%
rd                   297410    1.2%
pl                   199718    0.8%
3                    151044    0.6%
Other values (5618)  15920884  65.9%

Most occurring characters

Value              Count     Frequency (%)
(blank)            17477062  13.3%
A                  12264159  9.3%
T                  8534919   6.5%
S                  8445715   6.4%
E                  7674584   5.8%
/                  6695424   5.1%
R                  6566497   5.0%
V                  6409844   4.9%
N                  5713648   4.3%
L                  5292393   4.0%
Other values (62)  46426399  35.3%

Most occurring categories, scripts, and blocks

Value      Count      Frequency (%)
(unknown)  131500644  100.0%

[variable name missing]
Text

Distinct: 210
Distinct (%): < 0.1%
Missing: 7002
Missing (%): 0.1%
Memory size: 433.6 MiB

Length

Max length: 15
Median length: 14
Mean length: 10.583211
Min length: 7

Characters and Unicode

Total characters: 71155523
Distinct characters: 29
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: approaching
2nd row: approaching
3rd row: at stop
4th row: < 1 stop away
5th row: at stop
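
With only 210 distinct values, this text variable behaves more like a set of status phrases than free text. The sketch below groups such phrases into a few coarse classes. The phrases "approaching", "at stop" and "< 1 stop away" come from the sample rows; the "N stops away" and "X miles away" patterns are assumptions inferred from the word list, and the function name `classify_proximity` and the class labels are illustrative.

```python
import re

def classify_proximity(text):
    """Map a proximity phrase to a coarse class (labels are illustrative)."""
    if text is None:
        return "missing"
    text = text.strip().lower()
    if text in ("approaching", "at stop"):
        return text
    if re.fullmatch(r"<\s*1 stop away", text):
        return "< 1 stop away"
    # Assumed forms, not confirmed by the sample rows:
    if re.fullmatch(r"\d+(\.\d+)? miles? away", text):
        return "distance in miles"
    if re.fullmatch(r"\d+ stops? away", text):
        return "stops away"
    return "other"

for phrase in ["approaching", "at stop", "< 1 stop away"]:
    print(phrase, "->", classify_proximity(phrase))
```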

Words

Value               Count    Frequency (%)
stop                3972243  26.0%
approaching         2535937  16.6%
away                2287603  15.0%
1                   2072349  13.6%
(blank)             2072171  13.6%
at                  1899894  12.4%
miles               215254   1.4%
0.6                 38387    0.3%
0.5                 26686    0.2%
0.7                 23021    0.2%
Other values (203)  127160   0.8%

Most occurring characters

Value              Count     Frequency (%)
a                  11546974  16.2%
p                  9044117   12.7%
(blank)            8547271   12.0%
o                  6508180   9.1%
t                  5872137   8.3%
s                  4187497   5.9%
i                  2751191   3.9%
c                  2535937   3.6%
r                  2535937   3.6%
g                  2535937   3.6%
Other values (19)  15090345  21.2%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  71155523  100.0%

DistanceFromStop
Real number (ℝ)

Zeros 

Distinct: 16378
Distinct (%): 0.2%
Missing: 7002
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 225.88129
Minimum: 0
Maximum: 35910
Zeros: 376918
Zeros (%): 5.6%
Negative: 0
Negative (%): 0.0%
Memory size: 51.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 22
Median: 89
Q3: 198
95-th percentile: 591
Maximum: 35910
Range: 35910
Interquartile range (IQR): 176

Descriptive statistics

Standard deviation: 997.59162
Coefficient of variation (CV): 4.4164421
Kurtosis: 466.18625
Mean: 225.88129
Median Absolute Deviation (MAD): 76
Skewness: 19.186191
Sum: 1.5186979 × 10^9
Variance: 995189.03
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count    Frequency (%)
0                     376918   5.6%
3                     102020   1.5%
4                     86779    1.3%
5                     81394    1.2%
2                     79675    1.2%
7                     76552    1.1%
6                     75339    1.1%
8                     73854    1.1%
9                     70356    1.0%
10                    69676    1.0%
Other values (16368)  5630871  83.7%

Minimum 10 values

Value  Count   Frequency (%)
0      376918  5.6%
1      12357   0.2%
2      79675   1.2%
3      102020  1.5%
4      86779   1.3%
5      81394   1.2%
6      75339   1.1%
7      76552   1.1%
8      73854   1.1%
9      70356   1.0%

Maximum 10 values

Value  Count  Frequency (%)
35910  1      < 0.1%
35868  1      < 0.1%
33608  3      < 0.1%
33607  3      < 0.1%
33605  5      < 0.1%
33604  5      < 0.1%
33601  2      < 0.1%
33596  8      < 0.1%
33593  2      < 0.1%
33588  1      < 0.1%

ExpectedArrivalTime
Date

Missing 

Distinct: 733218
Distinct (%): 12.5%
Missing: 872302
Missing (%): 13.0%
Memory size: 51.3 MiB
Minimum: 2017-06-01 00:03:55
Maximum: 2017-07-01 00:01:25
Histogram with fixed size bins (bins=50)

ScheduledArrivalTime
Text

Missing 

Distinct: 93232
Distinct (%): 1.4%
Missing: 172333
Missing (%): 2.6%
Memory size: 411.8 MiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 52464824
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2490
Unique (%): < 0.1%

Sample

1st row: 24:06:14
2nd row: 23:58:02
3rd row: 24:00:53
4th row: 24:03:00
5th row: 23:59:38
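
The sample rows include hour values of 24 (for example 24:06:14), which ordinary clock-time parsers reject; this may be why the variable is typed as Text rather than Date. Assuming these are service-day times that run past midnight (an assumption, the report does not say so), one way to handle them is to parse each value into an offset from the start of the service day. The function name `parse_service_time` is illustrative.

```python
from datetime import timedelta

def parse_service_time(text: str) -> timedelta:
    """Parse 'HH:MM:SS' where HH may be 24 or more into an offset."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

offset = parse_service_time("24:06:14")
print(offset)  # 1 day, 0:06:14
```

Adding the offset to the midnight that starts the relevant service day would give an absolute timestamp comparable with ExpectedArrivalTime.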

Words

Value                 Count    Frequency (%)
08:47:00              2626     < 0.1%
08:29:00              2577     < 0.1%
08:17:00              2571     < 0.1%
08:07:00              2520     < 0.1%
08:05:00              2509     < 0.1%
08:14:00              2478     < 0.1%
07:55:00              2444     < 0.1%
07:59:00              2432     < 0.1%
07:25:00              2422     < 0.1%
08:25:00              2418     < 0.1%
Other values (93222)  6533106  99.6%

Most occurring characters

Value  Count     Frequency (%)
:      13116206  25.0%
0      8985015   17.1%
1      7365101   14.0%
2      4503566   8.6%
5      3655433   7.0%
4      3559613   6.8%
3      3555432   6.8%
7      2003873   3.8%
8      1998956   3.8%
6      1886992   3.6%

Most occurring categories, scripts, and blocks

Value      Count     Frequency (%)
(unknown)  52464824  100.0%

Interactions

[Figures: interaction plots, images not preserved]

Correlations

| | DestinationLat | DestinationLong | DirectionRef | DistanceFromStop | OriginLat | OriginLong | VehicleLocation.Latitude | VehicleLocation.Longitude |
| DestinationLat | 1.000 | 0.412 | 0.346 | 0.006 | 0.723 | 0.317 | 0.891 | 0.397 |
| DestinationLong | 0.412 | 1.000 | 0.402 | -0.045 | 0.315 | 0.572 | 0.388 | 0.843 |
| DirectionRef | 0.346 | 0.402 | 1.000 | 0.015 | 0.347 | 0.384 | 0.018 | 0.036 |
| DistanceFromStop | 0.006 | -0.045 | 0.015 | 1.000 | 0.014 | -0.046 | 0.006 | -0.049 |
| OriginLat | 0.723 | 0.315 | 0.347 | 0.014 | 1.000 | 0.400 | 0.888 | 0.391 |
| OriginLong | 0.317 | 0.572 | 0.384 | -0.046 | 0.400 | 1.000 | 0.387 | 0.819 |
| VehicleLocation.Latitude | 0.891 | 0.388 | 0.018 | 0.006 | 0.888 | 0.387 | 1.000 | 0.442 |
| VehicleLocation.Longitude | 0.397 | 0.843 | 0.036 | -0.049 | 0.391 | 0.819 | 0.442 | 1.000 |
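
A symmetric matrix of pairwise coefficients like the one above can be approximated with pandas. This is only a sketch: the report does not state which coefficient it used, so the Pearson default here is an assumption, and the helper name `correlation_table` and the toy values are hypothetical.

```python
import pandas as pd


def correlation_table(df: pd.DataFrame, columns: list, method: str = "pearson") -> pd.DataFrame:
    """Return pairwise correlations for the given columns, rounded to 3 decimals.

    `method` is passed straight to DataFrame.corr; "pearson" is an assumed
    default, not necessarily the coefficient the report used.
    """
    return df[columns].corr(method=method).round(3)


# Toy data with hypothetical values, shaped like a few of the report's columns.
toy = pd.DataFrame({
    "OriginLat": [40.61, 40.64, 40.87, 40.70],
    "DestinationLat": [40.65, 40.57, 40.91, 40.66],
    "DistanceFromStop": [76.0, 62.0, 5.0, 267.0],
})

table = correlation_table(toy, list(toy.columns))
print(table)
```

The diagonal is always 1.000 and the matrix is symmetric, which matches the layout of the table above.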

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
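
The nullity correlation described above can be sketched as the correlation between per-column missingness indicators. The function name `nullity_correlation` and the toy values are hypothetical, and this is an approximation of the idea rather than the report's own implementation.

```python
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between the missing/present indicators of each column."""
    is_null = df.isnull()
    # A column that is always present (or always missing) has zero variance,
    # so its correlation is undefined; leave such columns out.
    varying = is_null.loc[:, is_null.nunique() > 1]
    return varying.astype(int).corr()


# Toy frame with hypothetical values; None marks a missing cell.
sample = pd.DataFrame({
    "ExpectedArrivalTime": ["00:03:59", None, "00:03:56", None],
    "ScheduledArrivalTime": ["24:06:14", None, "24:00:53", "24:08:00"],
    "DistanceFromStop": [76.0, 0.0, 5.0, 0.0],
})

result = nullity_correlation(sample)
print(result)
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the two missingness patterns are unrelated.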

Sample

| | RecordedAtTime | DirectionRef | PublishedLineName | OriginName | OriginLat | OriginLong | DestinationName | DestinationLat | DestinationLong | VehicleRef | VehicleLocation.Latitude | VehicleLocation.Longitude | NextStopPointName | ArrivalProximityText | DistanceFromStop | ExpectedArrivalTime | ScheduledArrivalTime |
| 0 | 2017-06-01 00:03:34 | 0 | B8 | 4 AV/95 ST | 40.616104 | -74.031143 | BROWNSVILLE ROCKAWAY AV | 40.656048 | -73.907379 | NYCT_430 | 40.635170 | -73.960803 | FOSTER AV/E 18 ST | approaching | 76.0 | 2017-06-01 00:03:59 | 24:06:14 |
| 1 | 2017-06-01 00:03:43 | 1 | S61 | ST GEORGE FERRY/S61 & S91 | 40.643169 | -74.073494 | S I MALL YUKON AV | 40.575935 | -74.167686 | NYCT_8263 | 40.590802 | -74.158340 | MERRYMOUNT ST/TRAVIS AV | approaching | 62.0 | 2017-06-01 00:03:56 | 23:58:02 |
| 2 | 2017-06-01 00:03:49 | 0 | Bx10 | E 206 ST/BAINBRIDGE AV | 40.875008 | -73.880142 | RIVERDALE 263 ST | 40.912376 | -73.902534 | NYCT_4223 | 40.886010 | -73.912647 | HENRY HUDSON PKY E/W 235 ST | at stop | 5.0 | 2017-06-01 00:03:56 | 24:00:53 |
| 3 | 2017-06-01 00:03:31 | 0 | Q5 | TEARDROP/LAYOVER | 40.701748 | -73.802399 | ROSEDALE LIRR STA via MERRICK | 40.666012 | -73.735939 | NYCT_8422 | 40.668002 | -73.729348 | HOOK CREEK BL/SUNRISE HY | < 1 stop away | 267.0 | 2017-06-01 00:04:03 | 24:03:00 |
| 4 | 2017-06-01 00:03:22 | 1 | Bx1 | RIVERDALE AV/W 231 ST | 40.881187 | -73.909340 | MOTT HAVEN 136 ST via CONCOURSE | 40.809654 | -73.928360 | NYCT_4710 | 40.868134 | -73.893032 | GRAND CONCOURSE/E 196 ST | at stop | 11.0 | 2017-06-01 00:03:56 | 23:59:38 |
| 5 | 2017-06-01 00:03:40 | 0 | M1 | 4 AV/E 10 ST | 40.731342 | -73.990288 | HARLEM 147 ST via MADISON | 40.821110 | -73.935898 | NYCT_3831 | 40.792897 | -73.950023 | MADISON AV/E 106 ST | approaching | 73.0 | 2017-06-01 00:03:56 | 24:02:35 |
| 6 | 2017-06-01 00:03:24 | 0 | B31 | GERRITSEN AV/GERRITSEN BEACH | 40.587101 | -73.918503 | MIDWOOD KINGS HWY STA | 40.608433 | -73.957100 | NYCT_4611 | 40.587024 | -73.918623 | GERRITSEN AV/GERRITSEN BEACH | at stop | 0.0 | NaN | 24:08:00 |
| 7 | 2017-06-01 00:03:29 | 0 | B83 | GATEWAY CTR TERM/GATEWAY DR | 40.652649 | -73.877029 | BWAY JCT VN SNDRN AV | 40.678139 | -73.903572 | NYCT_4841 | 40.648801 | -73.882682 | PENNSYLVANIA AV/DELMAR LOOP N | < 1 stop away | 196.0 | 2017-06-01 00:04:13 | 23:58:47 |
| 8 | 2017-06-01 00:03:27 | 0 | B82 | STILLWELL TERMINAL BUS LOOP | 40.577080 | -73.981293 | SPRING CRK TWRS SEAVIEW AV via KINGS HWY | 40.642990 | -73.878326 | NYCT_6592 | 40.632258 | -73.918318 | FLATLANDS AV/RALPH AV | approaching | 35.0 | 2017-06-01 00:03:56 | 24:00:00 |
| 9 | 2017-06-01 00:03:51 | 1 | S59 | RICHMOND TER/PARK AV #3 | 40.640167 | -74.130966 | HYLAN BL | 40.534260 | -74.154213 | NYCT_8279 | 40.590689 | -74.165811 | RICHMOND AV/NOME AV | approaching | 31.0 | 2017-06-01 00:03:56 | 24:01:14 |

| | RecordedAtTime | DirectionRef | PublishedLineName | OriginName | OriginLat | OriginLong | DestinationName | DestinationLat | DestinationLong | VehicleRef | VehicleLocation.Latitude | VehicleLocation.Longitude | NextStopPointName | ArrivalProximityText | DistanceFromStop | ExpectedArrivalTime | ScheduledArrivalTime |
| 6730426 | 2017-06-30 23:53:08 | 0 | M23-SBS | 12 AV/W 23 ST | 40.748718 | -74.008110 | SELECT BUS EAST SIDE AVENUE C CROSSTOWN | 40.733006 | -73.974594 | NYCT_5859 | 40.749455 | -74.008044 | W 24 ST/12 AV | approaching | 143.0 | 2017-06-30 23:53:39 | 23:30:16 |
| 6730427 | 2017-06-30 23:53:11 | 1 | M15 | E 126 ST/2 AV | 40.803230 | -73.932449 | SOUTH FERRY via 2 AV | 40.701611 | -74.012230 | NYCT_6107 | 40.712605 | -73.992562 | MADISON ST/MARKET ST | < 1 stop away | 203.0 | 2017-06-30 23:54:57 | 23:46:22 |
| 6730428 | 2017-06-30 23:53:24 | 0 | B65 | SMITH ST/FULTON ST | 40.691208 | -73.987373 | CROWN HTS RALPH AV | 40.670197 | -73.922546 | NYCT_4931 | 40.675349 | -73.927717 | ROCHESTER AV/BERGEN ST | approaching | 85.0 | 2017-06-30 23:53:49 | 23:30:18 |
| 6730429 | 2017-06-30 23:53:12 | 0 | M104 | W 41 ST/8 AV | 40.756550 | -73.990120 | W HARLEM 129 ST via BWAY | 40.814907 | -73.955048 | NYCT_6705 | 40.767813 | -73.981383 | Columbus Circle (does not stop) | approaching | 44.0 | 2017-06-30 23:53:39 | 23:47:00 |
| 6730430 | 2017-06-30 23:53:20 | 0 | Bx11 | W 179 ST/BROADWAY | 40.849113 | -73.937752 | W FARMS RD SOUTHERN BL | 40.825272 | -73.891426 | NYCT_700 | 40.840354 | -73.922210 | W 170 ST/EDWARD L GRANT HY | approaching | 123.0 | 2017-06-30 23:54:13 | 24:09:34 |
| 6730431 | 2017-06-30 23:53:37 | 0 | B54 | JAY ST/MYRTLE PLZ | 40.694504 | -73.987122 | RIDGEWOOD TERM via MYRTLE | 40.700527 | -73.910149 | NYCT_4442 | 40.699765 | -73.911974 | GATES AV/WYCKOFF AV | approaching | 47.0 | 2017-06-30 23:53:52 | 23:44:12 |
| 6730432 | 2017-06-30 23:53:13 | 1 | M79-SBS | E 79 ST/EAST END AV | 40.770741 | -73.948715 | SELECT BUS W SIDE RIVERSIDE DR CROSSTOWN | 40.784988 | -73.982460 | NYCT_5927 | 40.770681 | -73.948759 | E 79 ST/EAST END AV | at stop | 0.0 | NaN | 24:02:00 |
| 6730433 | 2017-06-30 23:53:21 | 1 | M5 | BROADWAY/W 178 ST | 40.848522 | -73.937706 | 31 ST 6 AV | 40.747791 | -73.988831 | NYCT_6388 | 40.820420 | -73.955842 | W 135 ST/RIVERSIDE DR | approaching | 120.0 | 2017-06-30 23:54:27 | 23:44:16 |
| 6730434 | 2017-06-30 23:53:34 | 0 | M4 | W 32 ST/7 AV | 40.749405 | -73.991020 | WASH HTS CABRINI BLV via MADSON via BWAY | 40.859013 | -73.934250 | NYCT_6392 | 40.797009 | -73.948954 | CENTRAL PK N/5 AV | at stop | 22.0 | 2017-06-30 23:53:42 | 23:50:00 |
| 6730435 | 2017-06-30 23:53:18 | 0 | Bx2 | LINCOLN AV/E 137 ST | 40.809616 | -73.928276 | KNGSBRDG HTS FT INDEP ST via CONCOURSE | 40.878746 | -73.898033 | NYCT_4750 | 40.817354 | -73.922631 | E 149 ST/MORRIS AV | approaching | 37.0 | 2017-06-30 23:53:39 | 23:45:15 |